An Improved HITS Algorithm Based on Page-query Similarity and Page Popularity

Authors

  • Xinyue Liu
  • Hongfei Lin
  • Cong Zhang
Abstract

The HITS algorithm is a popular and effective algorithm for ranking web documents based on the link information among a set of web pages. However, it assigns the same weight to every link, and this assumption results in topic drift. In this paper, we first define the generalized similarity between a query and a page, and the popularity of a web page. We then propose a weighted HITS algorithm that differentiates the importance of links using the query-page similarities and the popularity of web pages. Experimental results indicate that the improved HITS algorithm finds more relevant pages than HITS and improves relevance by 30%-50%. Furthermore, it avoids the problem of topic drift and effectively enhances the quality of web search.
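The idea of weighting links in HITS can be illustrated with a minimal Python sketch. The abstract does not give the paper's formulas, so everything specific here is an assumption made for illustration: the function name `weighted_hits`, the toy graph, the similarity and popularity values, and in particular the choice of link weight as the product of the target page's similarity and popularity. It is not the authors' definition of generalized similarity or popularity.

```python
import math


def weighted_hits(links, weight, iterations=50):
    """Run HITS where each link (u, v) contributes in proportion to weight[(u, v)].

    links  : list of (source, target) pairs
    weight : dict mapping each pair to a non-negative float (hypothetical input)
    Returns (hub, authority) dicts with L2-normalized scores.
    """
    nodes = sorted({n for edge in links for n in edge})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}

    def normalize(scores):
        # Divide by the Euclidean norm; fall back to 1.0 if all scores are zero.
        norm = math.sqrt(sum(x * x for x in scores.values())) or 1.0
        return {n: x / norm for n, x in scores.items()}

    for _ in range(iterations):
        # Authority update: weighted sum of hub scores of pages linking in.
        new_auth = {n: 0.0 for n in nodes}
        for u, v in links:
            new_auth[v] += weight[(u, v)] * hub[u]
        auth = normalize(new_auth)

        # Hub update: weighted sum of authority scores of pages linked to.
        new_hub = {n: 0.0 for n in nodes}
        for u, v in links:
            new_hub[u] += weight[(u, v)] * auth[v]
        hub = normalize(new_hub)

    return hub, auth


# Toy data (invented for illustration): pages "a" and "b" both link to "c" and "d".
links = [("a", "c"), ("b", "c"), ("a", "d"), ("b", "d")]
similarity = {"a": 0.5, "b": 0.5, "c": 0.9, "d": 0.1}  # assumed query-page similarity
popularity = {"a": 1.0, "b": 1.0, "c": 1.0, "d": 1.0}  # assumed page popularity

# Assumed weighting scheme: similarity times popularity of the link target.
weighted = {(u, v): similarity[v] * popularity[v] for u, v in links}
uniform = {(u, v): 1.0 for u, v in links}  # plain HITS: every link counts equally

hub_w, auth_w = weighted_hits(links, weighted)
hub_u, auth_u = weighted_hits(links, uniform)
```

With uniform weights the two target pages are indistinguishable by link structure alone, whereas under the assumed weighting the page with higher query similarity receives the higher authority score. This is the kind of differentiation the abstract describes, shown here only on a toy graph.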

Similar resources

A Novel Approach to Feature Selection Using PageRank algorithm for Web Page Classification

In this paper, a novel filter-based approach is proposed using the PageRank algorithm to select the optimal subset of features as well as to compute their weights for web page classification. To evaluate the proposed approach multiple experiments are performed using accuracy score as the main criterion on four different datasets, namely WebKB, Reuters-R8, Reuters-R52, and 20NewsGroups. By analy...

A Score based Web Page Ranking Algorithm

With the explosive growth of information on the Web, users face difficulties in finding the information they want. A search engine helps the user by retrieving useful information from this huge collection based on his/her search query and presents a list of relevant web pages as a search result. However, without proper ranking of pages in the result through the relevancy of pages to the search...

A Review Paper on Page Ranking Algorithms

Page Rank is extensively used for ranking web pages in order of relevance by almost all search engines worldwide. There are many algorithms for page ranking, such as the Google Page Rank algorithm and the Hyperlink-Induced Topic Search (HITS) algorithm. Some search engines use link-structure-based page ranking algorithms while others use content-based ones. The page ranking algorithm reflects the popularity ...

An Improved Page Rank Algorithm based on Optimized Normalization Technique

Page ranking is an important component of an information retrieval system. It is used to measure the importance and behavior of web pages. We review two approaches for ranking: the HITS concept and the Page Rank method. Both approaches focus on the link structure of the Web to find the importance of web pages. The Page Rank algorithm calculates the rank of individual web page and Hypertext Induced To...

Related Packet Padding for Anonymous Web Browsing in Mobile Devices against Traffic Analysis Attack

Anonymous web browsing is becoming more popular as a means of protecting web privacy. To achieve anonymity, we propose a related packet padding strategy in which a web page related to the user request is selected as the cover page, based on popularity, for anonymous web browsing systems. Earlier, a predicted packet strategy was used for anonymous web browsing systems in which web page based on popularity is selected...

Journal title:
  • JCP

Volume 7  Issue 

Pages  -

Publication date 2012